1. Basis of the report

Description, Pages
1-54, as originally filed

Claims, Numbers
1-26, received on 16.08.2004 with letter of 16.08.2004

Drawings, Sheets
1-7, as originally filed
These elements were available or furnished to this Authority in the following language: , which is: 

□ the language of a translation furnished for the purposes of the international search (under Rule 23.1 (b)). 

□ the language of publication of the international application (under Rule 48.3(b)). 

□ the language of a translation furnished for the purposes of international preliminary examination (under
Rule 55.2 and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the
international preliminary examination was carried out on the basis of the sequence listing:

□ contained in the international application in written form. 

□ filed together with the international application in computer readable form. 

□ furnished subsequently to this Authority in written form. 

□ furnished subsequently to this Authority in computer readable form. 

□ The statement that the subsequently furnished written sequence listing does not go beyond the disclosure 
in the international application as filed has been furnished. 

□ The statement that the information recorded in computer readable form is identical to the written sequence 
listing has been furnished. 

4. The amendments have resulted in the cancellation of: 

□ the description, pages: 

□ the claims, Nos.: 

□ the drawings, sheets:

5. □ This report has been established as if (some of) the amendments had not been made, since they have
been considered to go beyond the disclosure as filed (Rule 70.2(c)).

(Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this 
report.) 

6. Additional observations, if necessary: 

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability; 
citations and explanations supporting such statement 

1. Statement

Novelty (N)                      Yes:  Claims  9-19, 25, 26
                                 No:   Claims  1-8, 20-24

Inventive step (IS)              Yes:  Claims  none
                                 No:   Claims  1-26

Industrial applicability (IA)    Yes:  Claims  1-26
                                 No:   Claims  none

2. Citations and explanations 
see separate sheet

Reference is made to the following documents:

D2: 'Biological Fluid Analysis Using Distance Outlier Detection', WO 97/06418 A
(BOEHRINGER MANNHEIM CORP) 20 February 1997 (1997-02-20)

D4: HUBERT ET AL, 'A fast method for robust principal components with applications
to chemometrics', Chemometrics and Intelligent Laboratory Systems, 2002, 60, 101-111.

1. The document D4 is mentioned for the first time in this International Preliminary
Examination Report.

2. The present application does not meet the criteria of Article 33(1) PCT, because the
subject-matter of claims 1 and 20 is not new in the sense of Article 33(2) PCT.

2.1 The document D2 is regarded as being the closest prior art to the subject-matter of
claim 1, and discloses (the references in parentheses applying to this document):

A method of verifying and controlling assays for the analysis of biological fluid analytes
("collecting high quality data", "removing outliers resulting from statistical or systematic
errors", page 2, par. 1; "to ensure accurate and consistent results", page 4, par. 2)

by means of statistical process control (using Hotelling's T statistics to identify
degradation/outliers anticipates statistical process control, and the calibration data set
corresponds to the historical data set; page 9, par. 2) comprising the following steps:

a) defining a reference data set ("calibration data", page 26, par. 1)

b) defining a test data set ("unknown samples", page 26, par. 1)

and reducing the data dimensionality of the reference and test data set by means of
robust embedding of the values into a lower dimensional representation (page 7, par. 4;
"selecting optimal number of factors with robust predictive abilities", page 23, par. 2;
Figure 4)

c) determining the statistical distance between the reference data set and test data set
or elements or subsets thereof (Hotelling's T statistics, Mahalanobis distance or Robust
distance, page 8, par. 3 - page 9, par. 2; page 26, par. 1)

d) identifying individual elements or subelements of the test dataset which have a
statistical distance larger than that of a predetermined value (page 9, par. 2; page 26,
par. 1; claim 11).
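
For illustration only, steps a) to d) above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of D2 or of the application: the function names, the synthetic data and the cutoff of 9.21 (roughly the 99% quantile of a chi-square distribution with two degrees of freedom) are hypothetical, and classical PCA computed by power iteration stands in for the dimensionality reduction.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equally long vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu):
    """Sample covariance matrix (divisor n - 1) around the mean mu."""
    n, p = len(rows), len(mu)
    centered = [[x - m for x, m in zip(row, mu)] for row in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_components(cov, k, iters=200):
    """Top-k (eigenvalue, eigenvector) pairs by power iteration with deflation."""
    p = len(cov)
    a = [row[:] for row in cov]
    pairs = []
    for _ in range(k):
        v = [1.0] * p
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * a[i][j] * v[j] for i in range(p) for j in range(p))
        pairs.append((lam, v))
        # Deflate: remove the component just found before searching for the next.
        for i in range(p):
            for j in range(p):
                a[i][j] -= lam * v[i] * v[j]
    return pairs


def t_squared(x, mu, pairs):
    """Hotelling-type T^2 distance of x from the reference, in the reduced space."""
    centered = [xi - m for xi, m in zip(x, mu)]
    total = 0.0
    for lam, v in pairs:
        score = sum(c * vi for c, vi in zip(centered, v))
        total += score * score / lam
    return total


def flag_outliers(reference, test, k, threshold):
    """Indices of test elements whose T^2 distance exceeds the threshold."""
    mu = mean_vector(reference)                               # a) reference data set
    pairs = leading_components(covariance(reference, mu), k)  # b) reduce to k dimensions
    return [i for i, x in enumerate(test)                     # c) distance, d) cutoff
            if t_squared(x, mu, pairs) > threshold]


# Hypothetical data: three measured variables, the third being low-variance noise.
rng = random.Random(0)
reference = [[3 * rng.gauss(0, 1), 2 * rng.gauss(0, 1), 0.1 * rng.gauss(0, 1)]
             for _ in range(200)]
test = [[1.0, -1.0, 0.0], [15.0, 0.0, 0.0], [0.0, 2.0, 0.05]]
flagged = flag_outliers(reference, test, k=2, threshold=9.21)
print(flagged)
```

In this sketch only the second test element lies far from the reference set along a retained component, so it is the one expected to exceed the cutoff.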

2.2 It is to be noted that D2 does not explicitly refer to "nucleic acids"; instead, it 
discloses "biological fluid analytes". Such an expression does not exclude the possibility 
of controlling nucleic acid variations since analytes are substances being measured in 
an analytical procedure. Moreover, the feature of representing the elements of the 
reference and test sets by nucleic acid measurements does not relate to the method of 
claim 1, since monitoring the test data by measuring deviations from the reference data 
is independent of the nature of variables. 

2.3 It is also to be noted that merely using the word "robust" does not imply any
specific feature that modifies a method of dimension reduction. Since no
additional specific limitations are implied, the method of D2, which employs principal
component analysis, a sound and robust method for capturing the directions of
variation in the data for dimension reduction, anticipates claim 1.

2.4 Even if the expression "robust principle component analysis" could be taken in the
sense of comprising the specific method steps of robust PCA (claim 9), the following
should be observed:

Claim 1 would define the use of "robust principle component analysis" for performing 
the step of dimensionality reduction. The method of D2 performs classical principal 
component analysis where the eigenvalues obtained from the covariance matrix are 
sorted and those accounting for most of the variation in the data are selected. However, 
it is generally known that when the data contains outliers, the covariance matrix is 
impaired and the eigenvalues might not describe the data correctly. When trying to 
implement the method of D2, the person skilled in the art is confronted with this 
problem since D2 discloses the presence of outliers. In order to overcome this problem, 
the skilled person would look for a method to improve dimensionality reduction in the 
presence of outliers in the relevant field. 

There, the skilled person will come across D4, which discloses using robust principal
component analysis for dimensionality reduction when the data contain anomalous
observations.

The skilled person will incorporate the teaching of D4 into the method of D2, which
already discloses constructing a robust model to detect outliers using PCA, and will
employ robust PCA for dimensionality reduction without exercising any inventive skill.

As a result, even with such a hypothetical interpretation of claim 1, its subject-matter
would not involve an inventive step in the sense of Article 33(3) PCT.
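
The sensitivity of classical PCA to outliers noted in point 2.4 can be made concrete with a small two-dimensional example. The sketch below is illustrative only and is not the algorithm of D4: it assumes a simple coordinate-wise median/MAD trimming rule as a hypothetical stand-in for a robust estimator, followed by classical PCA on the retained points. All names, data and the cutoff of 3.0 are assumptions.

```python
import math
from statistics import median


def leading_angle(points):
    """Angle (radians) of the first principal axis of 2-D points, classical PCA."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed form for the major axis of a 2x2 covariance matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def robust_z(values):
    """Absolute deviation from the median, scaled by 1.4826 * MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [abs(v - med) / (1.4826 * mad) for v in values]


def trim_outliers(points, cutoff=3.0):
    """Keep points whose robust z-score is within the cutoff in both coordinates."""
    zx = robust_z([p[0] for p in points])
    zy = robust_z([p[1] for p in points])
    return [p for p, a, b in zip(points, zx, zy) if a <= cutoff and b <= cutoff]


# Hypothetical data: 21 points spread along the x-axis plus three gross outliers in y.
clean = [(float(t), 0.2 if t % 2 == 0 else -0.2) for t in range(-10, 11)]
outliers = [(0.0, 40.0), (0.0, 45.0), (0.0, -50.0)]
contaminated = clean + outliers

angle_classical = leading_angle(contaminated)               # covariance impaired by outliers
angle_trimmed = leading_angle(trim_outliers(contaminated))  # outliers removed first
print(math.degrees(angle_classical), math.degrees(angle_trimmed))
```

With the three outliers present, the first principal axis of the classical estimate turns towards the y-axis although the bulk of the data varies along x. After trimming, the axis follows the bulk of the data again.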

2.5 The subject-matter of claim 20 is the same as that of claim 1 in the form of steps of 
a computer program corresponding to the steps of the method. Therefore, claim 20 is 
also not new over the disclosure of D2 in the sense of Article 33(2) PCT. 

3. Dependent claims 2-19 and 21-26 do not contain any features which, in combination 
with the features of any claim to which they refer, meet the requirements of the PCT in 
respect of novelty and/or inventive step. 

4. The claims are susceptible of industrial application in the sense of Article 33(4) PCT 
since they relate to the technical field of bioinformatics.

2. A method according to claim 1 wherein said nucleic acid variations are cytosine
methylation variations.

3. A method according to claims 1 and 2 wherein said statistical process control 
is taken from the group comprising multivariate statistical process control and 
univariate statistical process control.

1. A method of verifying and controlling assays for the
analysis of nucleic acid variations by means of
statistical process control comprising the following
steps

a) defining a reference data set

b) defining a test data set and reducing the data
dimensionality of the reference and test data set by
means of robust embedding of the values into a lower
dimensional representation

c) determining the statistical distance between the
reference data set and test data set or elements or
subsets thereof

d) identifying individual elements or subelements of
the test dataset which have a statistical distance
larger than that of a predetermined value.

4. The method according to claim 1 wherein step b) is carried out by calculating
the embedding space using one or both of the reference and the test data sets.

5. The method according to one of claims [illegible] to [illegible] further comprising,
e) further investigating said identified elements or subsets of the test dataset
to determine the contribution of individual variables to the determined
statistical distance.

6. The method according to one of claims [illegible] to [illegible] further comprising,
e) excluding said identified experiments or batches thereof from further
analysis.

7. The method of claim 1 wherein in step d) said statistical distance is calculated
by means of one or more methods taken from the group consisting of the
Hotelling's T² distance between a single test measurement vector and the
reference data set, the Hotelling's T² distance between a subset of the test data
set and the reference data set, the distance between the covariance matrices of
a subset of the test data set and the covariance matrix of the reference set,
percentiles of the empirical distribution of the reference data set and
percentiles of a kernel density estimate of the distribution of the reference
data set, distance from the hyperplane of a nu-SVM, estimating the support of
the distribution of the reference data set

8. The method according to one of claims [illegible] wherein the data
dimensionality reduction is carried out by means of principle component
analysis.

9. The method according to one of claims [illegible] 5, 6 and 10 wherein the data
dimensionality reduction step comprises the following steps
i) Projecting the data set by means of robust principle component analysis
ii) Removing outliers from the data set according to their statistical distances
calculated by means of one or more methods taken from the group consisting
of: Hotelling's T² distance; percentiles of the empirical distribution of the
reference data set; Percentiles of a kernel density estimate of the distribution
of the reference data set and distance from the hyperplane of a nu-SVM,
estimating the support of the distribution of the reference data set
iii) Calculating the embedding projection by standard principle component
analysis and projecting the cleared or the complete data set onto this basis
vector system.



10. The method according to one of claims [illegible] to [illegible] wherein at least one of the
variables measured in steps a) and b) is determined according to the
methylation state of the nucleic acids.

11. The method according to one of claims [illegible] to [illegible] wherein at least one of the
variables measured in step a) and b) is determined by the environment used
to conduct the assay.

12. The method according to one of claims [illegible] to [illegible] wherein said data sets
comprise one or more variables selected from the group comprising mean
background/baseline values; scatter of the background/baseline values; scatter
of the foreground values, geometrical properties of the array, percentiles of
background values of each spot and positive and negative assay control
measures.

13. A method according to one of claims [illegible] to [illegible] wherein the reference data set is
the complete series of experiments being analysed. (make it explicit in the
description that the test set can be a subset of the reference data set)

14. A method according to one of claims 4 to 14 wherein the reference data set is
derived from experiments carried out separately to those of the test data set.

15. A method according to one of claims [illegible] to [illegible] wherein the reference data set is
derived from a set of experiments wherein the value of each variable of each
experiment is either within a predetermined limit or optimally controlled.

16. A method according to one of claims [illegible] to [illegible] further comprising the
generation of a document comprising said elements or subsets of the test data
determined according to step d) of claim 1.

... 

17. A method according to claim [illegible] wherein said document further comprises
the contribution of individual variables to the determined statistical distance.

18. A method according to claims [illegible] and [illegible] wherein said document is stored on
a computer readable format.

19. A method according to one of claims 1 to [illegible] wherein said method is
implemented by means of a computer.

20. A computer program product for the verifying and controlling assays for the
analysis of nucleic acid variations comprising
a) a computer code that receives as input a reference data set
b) a computer code that receives as input a test data set
c) a computer code that determines the statistical distance between the
reference data set and test data set or elements or subsets thereof
d) a computer code that identifies individual elements or subsets of the test
dataset which have a statistical distance larger than that of a predetermined
value
a computer readable medium that stores the computer code.

23. The computer program product of claim 22 further comprising [illegible]
f) a computer code that reduces the data dimensionality of the reference and

21. The computer program product of claim 22 characterised in that the
embedding space is calculated using one or both of the reference and the test
data sets.

22. The computer program product of claims [illegible] to [illegible] further comprising,
g) a computer code that investigates said identified elements or subsets of the
test dataset to determine the contribution of individual variables to the
determined statistical distance.

23. The computer program product of claims 22 to [illegible] wherein said statistical
distance is calculated by means of one or more methods taken from the group
consisting of the Hotelling's T² distance between a single test measurement
vector and the reference data set, the Hotelling's T² distance between a subset
of the test data set and the reference data set, the distance between the
covariance matrices of a subset of the test data set and the covariance matrix
of the reference set, percentiles of the empirical distribution of the reference
data set and percentiles of a kernel density estimate of the distribution of the
reference data set, distance from the hyperplane of a nu-SVM, estimating the
support of the distribution of the reference data set

24. The computer program product of claims [illegible] and [illegible] wherein the data
dimensionality reduction is carried out by means of principle component
analysis.

25. The computer program product of claims 23, 24 and 21 wherein the data
dimensionality reduction step comprises the following steps
i) Projecting the data set by means of robust principle component analysis
ii) Removing outliers from the data set according to their statistical distances
calculated by means of one or more methods taken from the group consisting
of: Hotelling's T² distance; percentiles of the empirical distribution of the
reference data set; Percentiles of a kernel density estimate of the distribution
of the reference data set and distance from the hyperplane of a nu-SVM,
estimating the support of the distribution of the reference data set
iii) Calculating the embedding projection by standard principle component
analysis and projecting the cleared or the complete data set onto this basis
vector system.

26. The computer program product of claims [illegible] to [illegible] further comprising a
computer code that generates a document comprising said elements or
subsets of the test data determined according to step d) of claim [illegible].